In this note we prove that the local martingale part of a convex function $f$ of a $d$-dimensional semimartingale $X = M + A$ can be written in terms of an Itô stochastic integral $\int H(X)\,dM$, where $H(x)$ is some particular measurable choice of subgradient $\nabla f(x)$ of $f$ at $x$, and $M$ is the martingale part of $X$. This result was first proved by Bouleau in [2]. Here we present a new treatment of the problem. We first prove the result for $\tilde X = X + \epsilon B$, $\epsilon > 0$, where $B$ is a standard Brownian motion, and then pass to the limit as $\epsilon \to 0$, using results in [1] and [3].

1 Introduction 

Consider a general convex function $f : \mathbb{R}^d \to \mathbb{R}$, not necessarily everywhere differentiable. At every point $x \in \mathbb{R}^d$ where $f$ is differentiable there is a unique tangential hyperplane, while at non-differentiable points there is a whole family of supporting hyperplanes. For a continuous semimartingale $X$ with decomposition $X = M + A$ we prove that the (local) martingale part of $f(X)$ can be expressed as a stochastic integral of a measurable selection of subgradient $\nabla f(X)$ against $M$. For piecewise linear 1-dimensional convex functions this follows from the Meyer-Tanaka formula. In particular for $f(x) = |x|$ we have $\nabla f(x) = \mathrm{sgn}(x)$, where $\mathrm{sgn}(x) = -1$ if $x \le 0$ and $1$ otherwise, which is just the left-hand derivative. So at the origin, which is the only point where the derivative is not defined, we can take the supporting line to be $y = -x$.

The main result of this note is the following 

Theorem 1. Let $f : \mathbb{R}^d \to \mathbb{R}$ be convex and let $X$ be a continuous $\mathbb{R}^d$-valued semimartingale with Meyer decomposition $X_t = X_0 + M_t + A_t$, defined on a filtered probability space $(\Omega, \mathcal{F}, \{\mathcal{F}_t\}_{t \ge 0}, \mathbb{P})$. Then $f(X_t)$ is again a continuous semimartingale; in particular its local martingale part is given by

\[ \int_0^t \nabla f(X_s)\,dM_s \]

where $\nabla f(x)$ is some choice of subgradient of $f$ at $x$, such that $\nabla f(X_t)$ is $\mathcal{F}_t$-measurable for all $t \ge 0$.

The first part of the theorem, stating that $f(X_t)$ is a semimartingale, was proved by Meyer [11] and later by Carlen and Protter [4]. Meyer just proves that $f(X_t)$ is a semimartingale, while Carlen and Protter express the martingale and finite variation parts of the decomposition in terms of certain limits. Neither paper, however, gives an explicit semimartingale decomposition of $f(X_t)$. In [2] Bouleau took a step further and proved that at each $x \in \mathrm{dom}(f)$ there exists a choice $H(x)$ of subgradient $\nabla f(x)$ of $f$ such that the martingale part of the decomposition of $f(X_t)$ can be expressed as an Itô stochastic integral $\int H(X)\,dM$. In the follow-up paper [3] he proves the conjecture stated in [2] that in fact any measurable choice of $H(x)$ can be used. In this note we prove the first of the two results using an approach completely different from that in [2].

There are many other papers on extending Itô's formula by considering different classes of functions $f$ or stochastic processes or both. In [15], for example, Russo and Vallois derive Itô's formula for $C^1(\mathbb{R}^d)$-functions of continuous semimartingales whose time-reversals are also continuous semimartingales. They also extend the formula to the case of $C^1(\mathbb{R}^d)$-functions whose first order derivatives are Hölder-continuous with any parameter, with the process given by a stochastic flow generated by a so-called $C^0(\mathbb{R}^d, \mathbb{R}^d)$-semimartingale. In both cases the quadratic variation process is expressed in terms of the generalised quadratic covariation process $\langle f'(X), X \rangle_t$ introduced by the authors in an earlier paper [14] (see also the paper by Fuhrman and Tessitore [9], where the authors extend the notion of the generalised quadratic covariation further to the infinite-dimensional case and non-differentiable functions). In [8] Föllmer, Protter and Shiryayev consider the case of an absolutely continuous function $f$ with a locally square integrable derivative and $X$ a 1-dimensional Brownian motion, for which a version of Itô's formula is derived with the finite variation part expressed again in terms of the quadratic covariation $\langle f'(B), B \rangle_t$. The multidimensional case (where $f$ belongs to the Sobolev space $W^{1,2}$) is treated in [7]. In [10] Kendall discusses the semimartingale decomposition of $r(B)$, where $r$ is the distance function of Brownian motion on a manifold. The problem tackled in [10] is similar to ours, as $r$ fails to be differentiable on a set of measure zero, called the cut-locus. It is proved in [10] that $r(B)$ is a semimartingale and its canonical decomposition is found explicitly in the sequel [5].

The layout of the paper is as follows. In sections 2 and 3 we introduce some notation and preliminary results concerning convex functions, including some important theorems on differentiability: in particular in section 3 we explain that a proper convex function is everywhere differentiable (i.e. has a unique supporting hyperplane) except on a set of measure zero. Hence, by observing that a Brownian perturbation $X^{(\epsilon)}_t = X_t + \epsilon B_t$ of our semimartingale has a probability density at every time $t$, we show that for convex $f$ the gradient $\nabla f(X^{(\epsilon)}_t)$ is defined for all $t$ almost everywhere. To show that the martingale part of $f(X^{(\epsilon)})$ is $\int \nabla f(X^{(\epsilon)})\,dM^{(\epsilon)}$, where $M^{(\epsilon)} = M + \epsilon B$ and $\nabla f$ is some choice of subgradient, we approximate $f$ by a sequence of $C^2$ convex functions $f_n : \mathbb{R}^d \to \mathbb{R}$, $n \ge 1$. The martingale part of each $f_n(X^{(\epsilon)})$ is known explicitly from Itô's formula, and the convergence of $\nabla f_n$ to $\nabla f$ almost everywhere is ensured by results of section 3. Convergence of the stochastic integral $\int \nabla f_n(X^{(\epsilon)})\,dM^{(\epsilon)}$ to $\int \nabla f(X^{(\epsilon)})\,dM^{(\epsilon)}$ is ensured by a result of Carlen and Protter [4]. We conclude by proving the convergence $\lim_{\epsilon \to 0} \int \nabla f(X^{(\epsilon)})\,dM^{(\epsilon)} = \int \nabla f(X)\,dM$. Section 4 deals with the special case when $f$ is piecewise linear. By proving a generalised version of the Meyer-Tanaka formula we find the local martingale part of $f(X_t)$ and thus prove Theorem 1 for such $f$. We conclude by giving a particular example of a subgradient that satisfies Theorem 1.

2 Convex functions: some notation and results 

In order to prove the main result of this report we need to introduce some notation and results from the differential theory of convex functions. Proofs of the results stated in this section and more details on convex functions are given in [13]. Let $f$ be any function defined on $\mathbb{R}^d$ and taking values in $[-\infty, +\infty]$. At any point $x \in \mathbb{R}^d$ we define the one-sided directional derivative of $f$ with respect to a vector $y \in \mathbb{R}^d$, if it exists, as follows

\[ Df(x)[y] := \lim_{\lambda \downarrow 0} \frac{f(x + \lambda y) - f(x)}{\lambda} \]

The two-sided derivative at $x$ in direction $y$ exists if and only if $Df(x)[-y]$ is well defined and

\[ Df(x)[y] = -Df(x)[-y] \tag{1} \]

Now if the function $f$ is convex, then the one-sided directional derivative always exists and moreover we may write

\[ Df(x)[y] = \inf_{\lambda > 0} \frac{f(x + \lambda y) - f(x)}{\lambda} \tag{2} \]

Furthermore $Df(x)[y]$ is positively homogeneous (i.e. $Df(x)[\lambda y] = \lambda Df(x)[y]$ for $\lambda \in (0, \infty)$) and convex in $y$ with $Df(x)[0] = 0$ [13, Thm. 23.1] and

\[ Df(x)[y] \ge -Df(x)[-y] \tag{3} \]

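We also mention the upper semicontinuity of the one-sided derivative of a convex $f$ with respect to $y$:

\[ Df(x)[y] = \limsup_{z \to y} Df(x)[z] \ge \lim_{z \to y} Df(x)[z] \quad \text{for any } x \in \mathbb{R}^d \]

If for a general $f$ all directional derivatives exist and are two-sided and finite then we define the gradient of $f$ at $x = (x_1, \dots, x_d)$ by

\[ \nabla f(x) := \Big( \frac{\partial f}{\partial x_1}(x), \dots, \frac{\partial f}{\partial x_d}(x) \Big) \]

and for any non-zero vector $y = (y_1, \dots, y_d)$ we define its directional version by

\[ \langle \nabla f(x), y \rangle := \frac{\partial f}{\partial x_1}(x)\,y_1 + \dots + \frac{\partial f}{\partial x_d}(x)\,y_d \]

Also note $Df(x)[y] = \langle \nabla f(x), y \rangle$ for all $y$.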

Of course a general convex function $f$ is not necessarily everywhere differentiable, a simple example being $f(x) = |x|$, which is not differentiable at $x = 0$. We can however define a set of subgradients at such a "troublesome" point.

Definition 2. Let $f : \mathbb{R}^d \to \mathbb{R}$ be a convex function. A subgradient $\nabla f(x)$ of $f$ at $x \in \mathbb{R}^d$ is the gradient $\beta$ of an affine function $h(z) = \alpha + \beta^T z$, $\alpha \in \mathbb{R}$, $\beta \in \mathbb{R}^d$, whose graph passes through the point $(x, f(x))$ and which satisfies

\[ h(z) \le f(z) \]

for all other $z$. We denote any subgradient at $x$ with respect to $y \in \mathbb{R}^d$ by $\langle \nabla f(x), y \rangle$.

We say that the graph of $h$ is a supporting hyperplane of $f$ at the point $(x, f(x))$. Clearly at differentiable points $h$ is unique and is just the tangent of $f$. Conversely, at points where $f$ is not differentiable we can construct infinitely many supporting hyperplanes. The set of all subgradients at $x$ is called the subdifferential of $f$ at $x$, denoted $\partial f(x)$. A convex function with finite values is subdifferentiable everywhere. In subsequent sections we will need the following result

Theorem 3. ([13, Thm. 23.2]) Let $f$ be a convex function and $x$ a point at which $f$ is finite. Then $\nabla f(x)$ is a subgradient of $f$ at $x$ if and only if

\[ Df(x)[y] \ge \langle \nabla f(x), y \rangle \quad \forall y \in \mathbb{R}^d \setminus \{0\} \tag{4} \]

The above says that a subgradient at $x$ in the direction of $y$ is always less than or equal to the one-sided directional derivative at $x$ with respect to $y$. Relation (4) is called the subgradient inequality and can be used as an alternative definition of a subgradient.
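
As a worked illustration of the relations (1), (3) and (4), consider the running example $f(x) = |x|$ in dimension $d = 1$ at the point $x = 0$. The computation below is only a sketch for this one example (the symbol $g$ for a candidate slope is notation introduced here); it indicates that the subgradient inequality recovers exactly the slopes between $-1$ and $+1$.

```latex
\begin{align*}
Df(0)[y] &= \lim_{\lambda \downarrow 0} \frac{|0 + \lambda y| - |0|}{\lambda} = |y|, \\
-Df(0)[-y] &= -|y| < |y| = Df(0)[y] \quad (y \neq 0),
  \quad \text{so (3) is strict and (1) fails at } x = 0, \\
g \in \partial f(0) &\iff g\,y \le |y| \ \ \forall y \neq 0 \iff g \in [-1, 1].
\end{align*}
```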

3 Differential theory of convex functions 

This short section is devoted to studying the set $\mathcal{D}$ of points in the domain of $f$ at which the supporting hyperplane is unique. It is known [13, Thm. 25.2] that in order to have a unique supporting hyperplane it suffices for the partial derivatives with respect to the basis vectors of $\mathbb{R}^d$ to exist. Furthermore it turns out [13, Thm. 25.4] that the set of points at which $f$ fails to have a two-sided directional derivative has measure zero and $Df(x)[y]$ is a continuous function of $x$ on the set $\mathcal{D}_y$, $y \ne 0$, of points at which $Df(x)[y] = -Df(x)[-y]$.

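Now suppose $\{e_1, \dots, e_d\}$ is a basis of $\mathbb{R}^d$. Let $\mathcal{D}_i$ be the subset of $\mathbb{R}^d$ consisting of points where the two-sided derivative $\partial f / \partial x_i$ in the direction of $e_i$ exists, and let $\mathcal{D}_i^c$ be its complement in $\mathrm{int}(\mathrm{dom} f)$. Here $\mathrm{dom} f$ is the effective domain of $f$, which consists of the values of $x$ at which $f(x)$ is finite, i.e. $\mathrm{dom} f = \{x \in \mathbb{R}^d : -\infty < f(x) < +\infty\}$. Then $\mathcal{D} = \mathcal{D}_1 \cap \dots \cap \mathcal{D}_d$. Now by [13, Thm. 25.4] $\mathcal{D}_i^c$ has measure zero for all $i$. But $\mathcal{D}^c = \mathcal{D}_1^c \cup \dots \cup \mathcal{D}_d^c$ is then a finite union of null sets and so is also a null set. Hence $\mathcal{D}^c$ has measure zero. Finally, since each $\partial f / \partial x_i$ is continuous on its corresponding $\mathcal{D}_i$ (again by [13, Thm. 25.4]), $\nabla f(x) = (\partial f / \partial x_1, \dots, \partial f / \partial x_d)$ is continuous on $\mathcal{D}$.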

Thus we see that the set $\mathcal{D}^c$ of points at which a convex function $f$ does not have a unique supporting hyperplane has measure zero. Consequently any process which has a probability density at each time $t$ spends time of measure zero in $\mathcal{D}^c$, an important fact we will use in the sequel.

To prove Theorem 1 for a general convex $f$ we will approximate it by a sequence of twice continuously differentiable convex functions $f_n : \mathbb{R}^d \to \mathbb{R}$, to which we know Itô's formula can be applied. So to conclude this section we state a result saying that for such a sequence $\{f_n\}_{n \ge 1}$ with $f_n \to f$, $\nabla f_n(x)$ converges to $\nabla f(x)$ for all $x \in \mathcal{D}$.

Theorem 4. ([13, Thm. 25.7]) Let $f$ be a convex function defined on $\mathbb{R}^d$ and $\{f_n\}$ a sequence of smooth convex functions on $\mathbb{R}^d$ such that $\lim_{n \to \infty} f_n(x) = f(x)$ $\forall x \in \mathbb{R}^d$. Let $\mathcal{D} \subset \mathbb{R}^d$ be the set of points where $f$ is differentiable. Then

\[ \lim_{n \to \infty} \nabla f_n(x) = \nabla f(x) \quad \forall x \in \mathcal{D} \tag{5} \]

This result will be used several times in sections 5 and 6.

4 Piecewise linear convex functions and Meyer-Tanaka formula 

In this section we start our analysis of the martingale part of $f(X)$. However, instead of treating the case of a general convex $f$ we first prove Theorem 1 in the special case when $f$ is piecewise linear. Recall the simple one-dimensional example $f(x) = |x|$. Suppose $X$ is a continuous semimartingale with canonical decomposition $X_t = X_0 + M_t + A_t$. Then we cannot apply the usual Itô's formula to $f(X)$ since $f$ is not differentiable at $x = 0$ and so $df/dx$ is not well defined at the origin. The way around this problem is to choose the gradient at the origin from the set of possible subgradients ranging from $-1$ to $+1$. This is exactly what the Meyer-Tanaka formula does (the Tanaka formula, if $X = B$ is a standard Brownian motion).

Theorem 5. (Meyer-Tanaka formula for continuous semimartingales) Let $X$ be a continuous semimartingale. Define the function $\mathrm{sgn}(x)$ to be $-1$ if $x \le 0$ and $1$ otherwise. Then $f(X)$, where $f(x) = |x|$, is again a semimartingale and in particular

\[ |X_t| = |X_0| + \int_0^t \mathrm{sgn}(X_s)\,dX_s + L^0_t \]

where $L^0_t$ is the local time of $X$ at $0$.

So in this case $\nabla f(x)$ is the left-hand derivative. Moreover, because Brownian motion spends zero time in Lebesgue-null sets, when $X = B$ we can in fact choose $\nabla f(0)$ to be any number in the interval $[-1, +1]$. Using the Meyer-Tanaka formula we can prove a more general result. Namely, we will prove that any piecewise linear convex function of a continuous semimartingale is itself a continuous semimartingale and find the martingale part of the decomposition explicitly.
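
The structure of the Meyer-Tanaka formula can be checked exactly in a discrete-time analogue. The sketch below is an illustration only and makes assumptions that are not in the text: the continuous semimartingale is replaced by a simple random walk with steps in {-1, +1} started at 0, the stochastic integral by a sum over previous positions, and the local time by a counter that can only grow while the walk sits at 0. The function name `discrete_tanaka` is hypothetical. The sign convention is the one of Theorem 5.

```python
import random


def sgn(x):
    """Sign convention of the text: -1 if x <= 0, +1 otherwise."""
    return -1 if x <= 0 else 1


def discrete_tanaka(steps):
    """Decompose |S_n| for the walk S_0 = 0, S_{k+1} = S_k + steps[k].

    Returns (S_n, integral, local_time) where
      integral   = sum_k sgn(S_k) * steps[k]        (discrete stochastic integral)
      local_time = sum_k 1{S_k = 0} * (1 + steps[k]) (moves only at zero)
    Each step in `steps` must be -1 or +1.
    """
    s = 0
    integral = 0
    local_time = 0
    for xi in steps:
        integral += sgn(s) * xi
        if s == 0:
            local_time += 1 + xi  # equals 0 or 2, so the sum is nondecreasing
        s += xi
    return s, integral, local_time


if __name__ == "__main__":
    rng = random.Random(0)
    steps = [rng.choice((-1, 1)) for _ in range(10_000)]
    s, integral, local_time = discrete_tanaka(steps)
    # Discrete analogue of |X_t| = |X_0| + int sgn(X) dX + L_t (with X_0 = 0).
    print(abs(s) == integral + local_time)
```

The identity holds path by path: away from zero the increment of $|S|$ equals $\mathrm{sgn}(S_k)$ times the step, and at zero the correction $1 + \xi$ compensates for the choice $\mathrm{sgn}(0) = -1$.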

Proposition 6. Let $X = (X^1, \dots, X^d)$ be a continuous semimartingale living on $\mathbb{R}^d$, with $i$-th component having decomposition $X^i_t = X^i_0 + M^i_t + A^i_t$, $i \in \{1, \dots, d\}$. Let $f : \mathbb{R}^d \to \mathbb{R}$ be a function defined by $f(x) = l_1(x) \vee \dots \vee l_k(x)$, $x \in \mathbb{R}^d$, where $l_i(x) = \alpha_i + \sum_{j=1}^d \beta_{ij} x_j = \alpha_i + \beta_i^T x$, $\alpha_i \in \mathbb{R}$, $\beta_i \in \mathbb{R}^d$, $i \in \{1, \dots, k\}$, and $x \vee y := \sup\{x, y\}$. Then $f(X)$ is a semimartingale with decomposition

\[ f(X_t) = f(X_0) + \sum_{i=1}^k \int_0^t 1_{B_i}(X_s)\,\beta_i^T\,dX_s + \frac{1}{2} L_t \tag{6} \]

where $B_i = \{x : \min\{m : \sup_j l_j(x) = l_m(x)\} = i\}$ and $L_t$ is an increasing process, constant on the complement of $\{t : l_i(X_t) = l_j(X_t) \text{ for some } i \ne j\}$. In particular the local martingale part of $f(X)$ is given by

\[ \sum_{i=1}^k \int_0^t 1_{B_i}(X_s)\,\beta_i^T\,dM_s \tag{7} \]

Proof. We prove the proposition for the case $k = 2$ and any $d \ge 1$; the general case follows by induction. Consider $f(x) = l_1(x) \vee l_2(x)$. Denote $l_1(X_t) = Y_t$ and $l_2(X_t) = Z_t$. Since $X_t$ is a continuous semimartingale, so are the affine functionals $Y_t$ and $Z_t$ of $X_t$. Let the corresponding decompositions be $Y = M + A$ and $Z = N + S$. Consider $f(x) = l_1(x) \vee l_2(x) = y \vee z$. We can rewrite $y \vee z$ as follows

\[ y \vee z = \frac{1}{2}\big(|y - z| + y + z\big) \]

Hence, using the differential notation for simplicity, we obtain

\[ d(Y_t \vee Z_t) = \frac{1}{2}\,d\big(|Y_t - Z_t| + Y_t + Z_t\big) = \frac{1}{2}\big(d|W_t| + dY_t + dZ_t\big) \]

where $W := Y - Z$, and so $W = (M - N) + (A - S)$. Using the Meyer-Tanaka formula the above becomes

\[ \frac{1}{2}\big(\mathrm{sgn}(W_t)\,dW_t + dL^0_t + dY_t + dZ_t\big) \]

where $L^0_t$ is the local time of $W$ at $0$. Next

\begin{align*}
&\frac{1}{2}\big(\mathrm{sgn}(W_t)\,d(M_t - N_t) + \mathrm{sgn}(W_t)\,d(A_t - S_t) + d(M_t + A_t) + d(N_t + S_t) + dL^0_t\big) \\
&\quad = \frac{1}{2}\big[(\mathrm{sgn}(Y_t - Z_t) + 1)\,dM_t - (\mathrm{sgn}(Y_t - Z_t) - 1)\,dN_t \\
&\qquad\quad + (\mathrm{sgn}(Y_t - Z_t) + 1)\,dA_t - (\mathrm{sgn}(Y_t - Z_t) - 1)\,dS_t + dL^0_t\big]
\end{align*}

Now $\mathrm{sgn}(W_t) = \mathrm{sgn}(Y_t - Z_t) = 1_{[Y_t > Z_t]} - 1_{[Y_t \le Z_t]}$ and so $\mathrm{sgn}(W_t) + 1 = 2 \cdot 1_{[Y_t > Z_t]}$ and $\mathrm{sgn}(W_t) - 1 = -2 \cdot 1_{[Y_t \le Z_t]}$. Hence we obtain

\begin{align*}
d(Y_t \vee Z_t) &= 1_{[Y_t > Z_t]}\,dM_t + 1_{[Y_t \le Z_t]}\,dN_t + 1_{[Y_t > Z_t]}\,dA_t + 1_{[Y_t \le Z_t]}\,dS_t + \frac{1}{2}\,dL_t \\
&= 1_{[Y_t > Z_t]}\,dY_t + 1_{[Y_t \le Z_t]}\,dZ_t + \frac{1}{2}\,dL_t
\end{align*}

or

\[ Y_t \vee Z_t = Y_0 \vee Z_0 + \int_0^t 1_{[Y_s > Z_s]}\,dY_s + \int_0^t 1_{[Y_s \le Z_s]}\,dZ_s + \frac{1}{2} L_t \]

where $L_t$ is a continuous increasing process, constant on the complement of $\{t : l_1(X_t) = l_2(X_t)\}$. The above expression is exactly (6) for $k = 2$. Noticing that $x \vee y \vee z = (x \vee y) \vee z$, the general case follows by induction.

□



Clearly the integrand in (7) is a measurable selection of the multivalued map $\partial f(x)$ and so Theorem 1 holds in the special case of convex piecewise linear functions. To illustrate this result we consider our simple example again: for $f(x) = |x|$ we have $d = 1$, $k = 2$, $l_1(x) = -x$ and $l_2(x) = x$, and so $B_1 = \{x : x \le 0\}$, $B_2 = \{x : x > 0\}$ and $L_t$ is an increasing process constant on the complement of $\{t : X_t = 0\}$.

This result, although not essential, is a nice warm-up before we start dealing with a more general situation in the next sections. We refer the reader to [12, Ch. VI.1] for a detailed discussion of the classical Tanaka and Itô-Tanaka formulas (for $d = 1$). One might also find the discussion of convex functions in section 3 of the Appendix of [12] useful.
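
The selection appearing as the integrand in (7) is easy to compute for a concrete piecewise linear function. The following sketch is an illustration under assumptions of its own: the function names (`affine_values`, `active_index`, `selected_subgradient`) and the list-based representation of the coefficients $\alpha_i$, $\beta_i$ are hypothetical and not taken from the text. It picks the smallest index attaining the maximum, which corresponds to the set $B_i$, and returns the corresponding $\beta_i$; by construction this vector satisfies the supporting hyperplane inequality.

```python
def affine_values(alphas, betas, x):
    """Values l_i(x) = alpha_i + <beta_i, x> for every affine piece i."""
    return [a + sum(b_j * x_j for b_j, x_j in zip(b, x))
            for a, b in zip(alphas, betas)]


def f(alphas, betas, x):
    """Piecewise linear convex function f(x) = max_i l_i(x)."""
    return max(affine_values(alphas, betas, x))


def active_index(alphas, betas, x):
    """Smallest index i attaining the maximum, i.e. the i with x in B_i."""
    vals = affine_values(alphas, betas, x)
    return vals.index(max(vals))  # list.index returns the first match


def selected_subgradient(alphas, betas, x):
    """The selection sum_i 1_{B_i}(x) * beta_i used as integrand in (7)."""
    return betas[active_index(alphas, betas, x)]


if __name__ == "__main__":
    # f(x) = |x| written as max(-x, x): l_1(x) = -x, l_2(x) = x.
    alphas, betas = [0.0, 0.0], [[-1.0], [1.0]]
    for x in ([-2.0], [0.0], [3.0]):
        print(x, f(alphas, betas, x), selected_subgradient(alphas, betas, x))
```

For $f(x) = |x|$ the selection at the origin is $-1$, the left-hand derivative, in line with $B_1 = \{x : x \le 0\}$.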

5 Semimartingale decomposition of $f(\tilde X_t)$

We are now ready to start the analysis of the general case of a convex $f$ defined over the whole of Euclidean space $\mathbb{R}^d$. Let $X$ be a continuous semimartingale with decomposition $X = M + A$ defined on some filtered probability space $(\Omega, \mathcal{F}_t, \mathbb{P})$. Let $(\tilde\Omega, \tilde{\mathcal{F}}_t, \tilde{\mathbb{P}})$ be some enlargement of this space such that $B$ is an $(\tilde{\mathcal{F}}_t)$-standard Brownian motion independent of $X$. Define the perturbed process $\tilde X$ on $(\tilde\Omega, \tilde{\mathcal{F}}_t, \tilde{\mathbb{P}})$ by

\[ X^{(\epsilon)}_t := \tilde X_t := X_t + \epsilon B_t; \quad \epsilon > 0,\ t \ge 0 \]

For simplicity of notation we shall suppress the superscript $(\epsilon)$ wherever possible. For simplicity also, but without loss of generality, we can assume that $\tilde X_0 = X_0 = 0$.

In this section we find the martingale part of $f(X^{(\epsilon)})$ explicitly in order to take the limit as $\epsilon \to 0$ in the next section and hence prove Theorem 1. The reasoning behind adding a small amount of Brownian motion to $X_t$ is as follows: we know very little about the behaviour of $X_t$ as it is a general semimartingale. For instance, it can at some times be trivial, i.e. constant. Hence, with positive probability, it might spend a positive amount of time at points where $f$ is not differentiable, that is where it has more than one supporting hyperplane. To avoid this happening we perturb $X_t$ by adding $\epsilon B_t$. Then

Lemma 7. $\tilde X_t$ has a probability density at each $t > 0$ and in particular spends zero time in any null set.

Proof. It suffices to prove that $\tilde{\mathbb{P}}(\tilde X_t \in N) = 0$ for any $t > 0$ and any $N \subset \mathbb{R}^d$ with $\mathrm{Leb}(N) = 0$. Then the law of $\tilde X_t$ is absolutely continuous with respect to Lebesgue measure and the corresponding Radon-Nikodym derivative is the probability density of $\tilde X_t$. So, for any Lebesgue-null set $N$ we have

\[ \tilde{\mathbb{P}}(\tilde X_t \in N) = \mathbb{E}\big[\tilde{\mathbb{P}}(\tilde X_t \in N \mid \mathcal{F}_t)\big] \]

where $\mathcal{F}_t = \sigma(\{X_s ; 0 \le s \le t\})$ and we use the tower property of conditional expectation. Next we express $\tilde X_t$ in terms of $X_t$ and $B_t$ and use the fact that $B_t$ is independent of $X_t$, and hence of $\mathcal{F}_t$, to obtain

\[ \mathbb{E}\big[\tilde{\mathbb{P}}(X_t + \epsilon B_t \in N \mid \mathcal{F}_t)\big] = \int \tilde{\mathbb{P}}(x + \epsilon B_t \in N)\,d\mu_t(x) \]

where $\mu_t$ is the law of $X_t$. Observe that $\tilde B_t := x + \epsilon B_t$ is a Brownian motion starting at $x$ with $\langle \tilde B, \tilde B \rangle_t = \epsilon^2 t$. For fixed $t > 0$ it has a Gaussian density, so it lies in a Lebesgue-null set with probability zero. Hence the above integral is equal to zero and the lemma is proved.

□
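
For concreteness, the inner probability in the proof of Lemma 7 can be written out explicitly. The following is a sketch under the assumption that $B$ is a standard $d$-dimensional Brownian motion, so that $x + \epsilon B_t$ is Gaussian with mean $x$ and covariance $\epsilon^2 t\, I_d$; the symbols $p_t$ (density of the perturbed process at time $t$) and $\mu_t$ (law of $X_t$) are notation used for this illustration.

```latex
\begin{align*}
\tilde{\mathbb{P}}(x + \epsilon B_t \in N)
  &= \int_N (2\pi \epsilon^2 t)^{-d/2}
     \exp\!\Big(-\frac{|z - x|^2}{2\epsilon^2 t}\Big)\,dz = 0
     \quad \text{whenever } \mathrm{Leb}(N) = 0, \\
p_t(z) &= \int_{\mathbb{R}^d} (2\pi \epsilon^2 t)^{-d/2}
     \exp\!\Big(-\frac{|z - x|^2}{2\epsilon^2 t}\Big)\,d\mu_t(x)
     \quad \text{(density of } X_t + \epsilon B_t \text{ at time } t > 0\text{)}.
\end{align*}
```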

In section 3 we have seen that $\mathcal{D}^c$, the set of points at which $f$ is not differentiable, is Lebesgue-null. Consequently $\tilde X$ spends zero time at those "ambiguous" points in $\mathcal{D}^c$ (or at least spends zero time travelling in "ambiguous" directions in which a gradient is not uniquely specified). Hence $\nabla f(\tilde X)$ is almost surely everywhere defined. Because of Lemma 7 a particular measurable choice of $\nabla f(x) \in \partial f(x)$ at $x \in \mathcal{D}^c$ is therefore unimportant, as it does not change the value of the stochastic integral $\int_0^t \nabla f(\tilde X_s)\,d\tilde M_s$, which we will show is the martingale part of $f(\tilde X)$. To do that we approximate $f$ by a sequence of convex twice continuously differentiable functions.
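
To make the approximation step concrete, here is a small numerical sketch for the one-dimensional example $f(x) = |x|$. The particular smoothing $f_n(x) = \sqrt{x^2 + n^{-2}} - n^{-1}$ is an assumption chosen for this illustration (it is not the construction used in the cited references): each $f_n$ is smooth and convex, the sequence increases to $|x|$, the derivatives are bounded by 1 uniformly in $n$, and they converge to $\mathrm{sgn}(x)$ at every $x \ne 0$, i.e. on the set where $f$ is differentiable.

```python
import math


def f(x):
    """Target convex function f(x) = |x| (not differentiable at 0)."""
    return abs(x)


def f_n(x, n):
    """Smooth convex approximation sqrt(x^2 + 1/n^2) - 1/n (illustrative choice)."""
    a = 1.0 / n
    return math.sqrt(x * x + a * a) - a


def grad_f_n(x, n):
    """Derivative of f_n; its absolute value never exceeds 1."""
    a = 1.0 / n
    return x / math.sqrt(x * x + a * a)


def sup_error(n, k=5.0, steps=2001):
    """Grid estimate of sup_{|x| <= k} |f(x) - f_n(x)|."""
    grid = [-k + 2.0 * k * i / (steps - 1) for i in range(steps)]
    return max(abs(f(x) - f_n(x, n)) for x in grid)


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        # Columns: n, uniform error on [-5, 5], derivative at 0.5, -0.5 and 0.
        print(n, sup_error(n), grad_f_n(0.5, n), grad_f_n(-0.5, n), grad_f_n(0.0, n))
```

At $x = 0$ the derivatives of this particular sequence stay at $0$, which is one element of $\partial f(0) = [-1, 1]$; what happens on the null set $\mathcal{D}^c$ does not matter for the perturbed process.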

Let $\{f_n\}_{n \ge 1}$ be a sequence of $C^2$ convex functions defined on $\mathbb{R}^d$. We equip the set of convex functions on $\mathbb{R}^d$ with the topology of uniform convergence on compact sets, with the corresponding metric $\rho$ defined by $\rho(f, g) = \sum_{k=1}^{\infty} 2^{-k} \rho_k(f, g)$ where

\[ \rho_k(f, g) = \frac{\sup_{|x| \le k} |f(x) - g(x)|}{1 + \sup_{|x| \le k} |f(x) - g(x)|} \]

Let $f_n$ increase to $f$ and $\lim_{n \to \infty} \rho(f_n, f) = 0$. From Theorem 4 we know that $\nabla f_n(x)$ converges to $\nabla f(x)$ for all $x \in \mathcal{D}$ and from Lemma 7 that particular choices of $\nabla f(x)$ for $x \in \mathcal{D}^c$ are unimportant when we deal with the Brownian perturbation $\tilde X_t$. We now have to prove that the stochastic integral $\int_0^t \nabla f_n(\tilde X_s)\,d\tilde M_s$, the martingale part of $f_n(\tilde X)$, converges in some sense to $\int_0^t \nabla f(\tilde X_s)\,d\tilde M_s$ and that it is indeed the martingale part of $f(\tilde X)$. It turns out that the convergence is in $\mathcal{H}^1$ norm: for a continuous semimartingale $X$ with decomposition $X = M + A$ we define

\[ \| X \|_{\mathcal{H}^p} = \Big\| \langle M, M \rangle_\infty^{1/2} + \int_0^\infty |dA_s| \Big\|_{L^p} \]

The $\mathcal{H}^p$-space consists of all semimartingales $X$ such that $\| X \|_{\mathcal{H}^p} < \infty$. Once the convergence is established, the fact that $\int \nabla f(\tilde X)\,d\tilde M$ is the local martingale part of $f(\tilde X)$ will follow from [4, Thm. 2] of Carlen and Protter.

Suppose $(X^n)_{n \ge 1}$ is a sequence of continuous semimartingales with decompositions $X^n = X^n_0 + M^n + A^n$, such that $\lim_{n \to \infty} \mathbb{E}[(X^n - X)^*] = 0$. Here $X^* = \sup_t |X_t|$. Barlow and Protter prove ([1, Thm. 1]) that under some regularity conditions imposed on $M^n$ and $A^n$ not only is the limiting process $X$ again a continuous semimartingale, but there is also convergence of the corresponding martingale and finite variation parts of the decompositions. More specifically, the space $\mathcal{H}^1$ of local martingales is complete and so is $\mathcal{A}^1$, the space of processes of finite variation with norm $\|A\|_{\mathcal{A}^1} = \big\| \int_0^\infty |dA_s| \big\|_{L^1}$ (see Emery [6]).

In [4] Carlen and Protter prove that the assumptions of [1, Thm. 1] are satisfied in the case when a sequence of $C^2$ convex functions $\{f_n\}_{n \ge 1}$ of a (not necessarily continuous) semimartingale $X = M + A$ converges (increases) to a convex $f$, thus making the result applicable in our situation.

We need to note the following two inequalities

\[ \sup_n \sup_{|x| \le r} |\nabla f_n(x)| \le C_r < \infty; \quad \forall r > 0 \tag{8} \]

and

\[ \sup_{|x| \le r} |\nabla f(x)| \le C_r < \infty; \quad \forall r > 0 \tag{9} \]

where $C_r$ is some constant only depending on $r$. To see why inequality (8) is true first notice that since $\lim_{n \to \infty} \rho(f_n, f) = 0$ the variation of $f_n$ is uniformly bounded in $n$ on $\{|x| \le r + 1\}$ for any $r > 0$. Denote this bound by $C_r$. Let $x_n$ be such that

\[ |\nabla f_n(x_n)| = \sup_{|x| \le r} |\nabla f_n(x)| \]

and let $u_n := \nabla f_n(x_n) / |\nabla f_n(x_n)|$. Then

\[ |\nabla f_n(x_n)| \le \inf_{\lambda > 0} \frac{f_n(x_n + \lambda u_n) - f_n(x_n)}{\lambda} \le f_n(x_n + u_n) - f_n(x_n) \tag{10} \]

But since $|x_n + u_n| \le r + 1$ the above is less than $C_r$ for all $n$ and (8) follows.

Now since $f_n$ converges to $f$ uniformly on compact sets we also have $f_n \to f$ pointwise. Therefore for any $x, y$ with $|x|, |y| \le r + 1$ the inequality $f_n(x) - f_n(y) \le C_r$, $n \ge 1$, implies $f(x) - f(y) \le C_r$ by taking the limit $n \to \infty$. Expression (9) then follows by the same argument we used to prove (8).

We are now ready to prove the following

Lemma 8. The local martingale part of $f(\tilde X_t)$ is given by the limit

\[ \int_0^t \nabla f(\tilde X_s)\,d\tilde M_s = \lim_{n \to \infty} \int_0^t \nabla f_n(\tilde X_s)\,d\tilde M_s \tag{11} \]

locally in $\mathcal{H}^1$, where $\nabla f(x) \in \partial f(x)$ is some measurable choice of subgradient of $f$ at $x$.

Proof. Since for each $n \ge 1$ $f_n$ is in $C^2$, the martingale part of $f_n(\tilde X)$ is given by $\int \nabla f_n(\tilde X)\,d\tilde M$, where $\tilde M = M + \epsilon B$. The result of Carlen and Protter [4, Thm. 2] ensures that the martingale part of the limiting process $f(\tilde X_t)$ is given by the limit of $\int \nabla f_n(\tilde X)\,d\tilde M$ as $n$ tends to infinity, locally in $\mathcal{H}^1$. Our aim is to prove that this limit is given by $\int \nabla f(\tilde X)\,d\tilde M$ for some measurable choice of subgradient $\nabla f \in \partial f$.

Since $\tilde X$ is a continuous semimartingale, by means of localisation we can assume that $\tilde X$ is contained in an open ball of radius $r > 0$ centered at the origin, denoted by $B(r)$. Localisation can also be done in such a way that the localised process is simultaneously in $\mathcal{H}^1$; we know that continuous semimartingales are at least locally in $\mathcal{H}^1$. In particular we assume that $\langle \tilde M, \tilde M \rangle_t$ is bounded in $B(r)$.

Define $\tilde T_r = \inf\{t : \tilde X_t \notin B(r)\}$ and consider the stopped process $\tilde X_{t \wedge \tilde T_r}$. By Lemma 7, within the ball $B(r)$ the localised process has a density. It does not matter if $\tilde X_{\tilde T_r}$ is not in $\mathcal{D}$, since the value of $\nabla f(\tilde X)$ at time $\tilde T_r$ does not change the value of the integral $\int_0^{\tilde T_r} \nabla f_n(\tilde X_t)\,d\tilde M_t$.

Notice that convergence of a continuous (local) martingale $M$ in $\mathcal{H}^p$ is equivalent to convergence of $\langle M, M \rangle^{1/2}$ in $L^p$, and so convergence in $\mathcal{H}^p$ implies convergence in $\mathcal{H}^l$ for $1 \le l \le p$ in this case. In our case it is easier to prove convergence (11) in $\mathcal{H}^2$ and then deduce convergence in $\mathcal{H}^1$. Now, for any measurable selection $\nabla f \in \partial f$ we have

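\[ \lim_{n \to \infty} \Big\| \int_0^{\tilde T_r} \big(\nabla f_n(\tilde X_t) - \nabla f(\tilde X_t)\big)\,d\tilde M_t \Big\|_{\mathcal{H}^2} = \lim_{n \to \infty} \mathbb{E}\Big[ \int_0^{\tilde T_r} \big(\nabla f_n(\tilde X_t) - \nabla f(\tilde X_t)\big)^2\,d\langle \tilde M, \tilde M \rangle_t \Big]^{1/2} \tag{12} \]

Using inequalities (8) and (9) we see that the integrand in (12) is bounded above by $4 C_r^2$. Recall that $\langle \tilde M, \tilde M \rangle_t$ must also be bounded in $B(r)$. Hence both the integrand and the expression inside the expectation sign must be bounded and we can use the bounded convergence theorem to pull the limit inside the expectation and the integral sign, so that (12) is equal to

\[ \mathbb{E}\Big[ \int_0^{\tilde T_r} \lim_{n \to \infty} \big(\nabla f_n(\tilde X_t) - \nabla f(\tilde X_t)\big)^2\,d\langle \tilde M, \tilde M \rangle_t \Big]^{1/2} \]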



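We can then use almost sure convergence of $\nabla f_n(\tilde X_t)$ to $\nabla f(\tilde X_t)$ for all $\tilde X_t \in \mathcal{D}$ and the fact that particular choices $\nabla f(\tilde X_t) \in \partial f(\tilde X_t)$ for $\tilde X_t \in \mathcal{D}^c$ are not charged by the integral to conclude that $\int_0^{\tilde T_r} \nabla f_n(\tilde X_s)\,d\tilde M_s$ converges to $\int_0^{\tilde T_r} \nabla f(\tilde X_s)\,d\tilde M_s$ in $\mathcal{H}^2$ and hence in $\mathcal{H}^1$. This is true for any radius $r > 0$ of the region of localisation $B(r)$ and so (11) follows.

□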



6 Proof of Theorem 1 

Finally we need to derive the analogous result for our original object of interest, the continuous semimartingale $X$.

Proof of Theorem 1. We have $\lim_{\epsilon \to 0} X^{(\epsilon)} = X$ almost surely and thus $\lim_{\epsilon \to 0} f(X^{(\epsilon)}) = f(X)$ almost surely. Note that the limit of $X^{(\epsilon)}$ as $\epsilon$ tends to zero lives in the enlarged probability space $(\tilde\Omega, \tilde{\mathcal{F}}, \tilde{\mathbb{P}})$, even though the original process $X$ is defined on $(\Omega, \mathcal{F}, \mathbb{P})$. Crucially, by Itô's lemma $f(X^{(\epsilon)})$ is a continuous semimartingale for every $\epsilon > 0$. Hence we can apply the result of Barlow and Protter [1, Thm. 1] if we can show that the conditions of the theorem are satisfied in our case. Luckily most of the work has been done for us in [4] and we just have to slightly amend the arguments therein.

We first need to suitably localise our process. Let $B(r)$ be an open ball of radius $r$ and $B(r')$ an open ball of radius $r'$ with $r' > r > 0$. For all $r, r' > 0$ define stopping times $T_r := \inf\{t : X_t \notin B(r)\}$ and $\tilde T_{r'} := \inf\{t : \tilde X_t \notin B(r')\}$ and take $T = T_r \wedge \tilde T_{r'}$. Assume also that $X_{t \wedge T}, \tilde X_{t \wedge T} \in \mathcal{H}^1$ for all $t \ge 0$ and in particular that $\langle M, M \rangle_t$ is bounded in $B(r)$. We consider the stopped process $\tilde X_{t \wedge T}$. Note that $X_{t \wedge T} \in B(r) \subset B(r')$ and $\tilde X_{t \wedge T} \in B(r')$ for all $t \ge 0$. The localised process is absolutely continuous until it is stopped at time $T$. Again the value of $\tilde X$ at $T$ does not affect the stochastic integral $\int_0^t \nabla f(\tilde X_{s \wedge T})\,d\tilde M_{s \wedge T} = \int_0^{t \wedge T} \nabla f(\tilde X_s)\,d\tilde M_s$.

In what follows we will need the fact that $f$ is Lipschitz in the ball $B(r)$ for any $r > 0$ (see for example [13, Thm. 24.7]). Hence by a calculation similar to (10) we have

\[ |\nabla f(\tilde X_{t \wedge T})| \le C_{r'} < \infty; \quad t \ge 0 \tag{13} \]

where $C_{r'} > 0$ is some finite constant only depending on $r'$.

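Now to apply the results of Barlow and Protter we need to prove

\[ \lim_{\epsilon \to 0} \mathbb{E}\Big[ \sup_{t \le T} |f(X^{(\epsilon)}_t) - f(X_t)| \Big] = 0 \tag{14} \]

\[ \sup_{\epsilon} \mathbb{E}\Big[ \sup_{t \le T} |N^{(\epsilon)}_t| \Big] \le K_{r,r'} \tag{15} \]

\[ \sup_{\epsilon} \mathbb{E}\Big[ \int_0^T |dS^{(\epsilon)}_t| \Big] \le K_{r,r'} \tag{16} \]

where $N^{(\epsilon)} := \tilde N$ and $S^{(\epsilon)} := \tilde S$ are the martingale and finite variation parts of the semimartingale decomposition of $f(\tilde X)$ respectively and $K_{r,r'}$ is some finite constant which only depends on $r$ and $r'$.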



We first look at expression (14). Since $f$ is Lipschitz in the ball $B(r')$ we have

\[ \sup_{t \le T} |f(\tilde X_t) - f(X_t)| \le \epsilon K_{r'} \sup_{t \le T} |B_t| \]

where $K_{r'} < \infty$ is a Lipschitz constant depending on $r'$. Taking the limit $\epsilon \to 0$ gives (14).



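To prove (15) first note that by Lemma 8 for each $\epsilon > 0$ the martingale part of $f(X^{(\epsilon)})$ is $N^{(\epsilon)} = \int \nabla f(X^{(\epsilon)})\,dM^{(\epsilon)}$ where $\nabla f$ is some measurable choice of subgradient. Then by the Burkholder-Davis-Gundy inequality we have for some constant $p < \infty$

\begin{align*}
\mathbb{E}\Big[ \sup_{t \le T} |\tilde N_t| \Big]
  &\le p\,\mathbb{E}\big[ \langle \tilde N, \tilde N \rangle_T^{1/2} \big] \\
  &= p\,\mathbb{E}\Big[ \Big( \int_0^T |\nabla f(\tilde X_t)|^2\,d\langle \tilde M, \tilde M \rangle_t \Big)^{1/2} \Big] \\
  &\le p\,C_{r'}\,\mathbb{E}\big[ \langle \tilde M, \tilde M \rangle_T^{1/2} \big]
\end{align*}

where the second inequality follows by inequality (13). To finish we need to bound $\langle \tilde M, \tilde M \rangle_T$ by some constant independent of $\epsilon$. Now $\langle \tilde M, \tilde M \rangle_T = \langle M, M \rangle_T + \epsilon^2 T$ which for sufficiently small $\epsilon$, and hence eventually for all $\epsilon$, is less than $\langle M, M \rangle_T + T$, which is in turn bounded above by $\langle M, M \rangle_{T_r} + T_r$, since $T_r \ge T = T_r \wedge \tilde T_{r'}$. Moreover recall that $\langle M, M \rangle_{t \wedge T_r}$, and hence $\langle M, M \rangle_T$, is bounded in $B(r)$ by some constant only depending on $r$. Thus for all sufficiently small $\epsilon$, $\langle \tilde M, \tilde M \rangle_T$ is bounded above by some constant that only depends on the radius $r$ (and not on $\epsilon$) and (15) follows.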



The proof of (16) mimics the argument in Carlen and Protter [4, pp. 4-5], modulo obvious simplifications to allow for the fact that our case is continuous and using the fact that

\[ |f(\tilde X_{t \wedge T}) - f(\tilde X_0)| \le K_{r'} |\tilde X_{t \wedge T} - \tilde X_0| \le r' K_{r'} \]

for some constant $K_{r'} < \infty$ by Lipschitz continuity of $f$ in $B(r')$.

Now that we have verified conditions (14)-(16) we can apply [1, Thm. 1] to our case. It follows immediately that the martingale part of $f(X)$ is given by the limit as $\epsilon \to 0$ of $\int \nabla f(X^{(\epsilon)})\,dM^{(\epsilon)}$, the martingale part of $f(X^{(\epsilon)})$, locally in $\mathcal{H}^1$. All that is left to prove now is that this limit is given by $\int \nabla f(X)\,dM$ for some choice of $\nabla f(x) \in \partial f(x)$, i.e. that for all $t \ge 0$

\[ \lim_{\epsilon \downarrow 0} \int_0^t \nabla f(X^{(\epsilon)}_{s \wedge T})\,dM^{(\epsilon)}_{s \wedge T} = \int_0^t \nabla f(X_{s \wedge T_r})\,dM_{s \wedge T_r} \tag{17} \]

in $\mathcal{H}^1$.

Proving the above convergence will require us to consider the limit of $\nabla f(X^{(\epsilon)}_{t \wedge T})$ as $\epsilon$ tends to $0$. Of course in general when $\lim_{n \to \infty} x_n = x$ for $x \in \mathrm{dom} f$ and $x_n \in \mathrm{dom} f$ $\forall n \ge 1$, $\lim_{n \to \infty} \nabla f(x_n)$ need not exist. However the situation when $x_n = x + \epsilon_n y$ for some $y \in \mathbb{R}^d$ and $\epsilon_n \to 0$ as $n \to \infty$, i.e. when $x_n$ approaches $x$ from a single direction $y$, is special. In this case it is known that $\nabla f(x_n)$ converges to the part of the boundary of $\partial f(x)$ consisting of points at which $y$ is normal to $\partial f(x)$. Moreover

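Lemma 9. Let $f : \mathbb{R}^d \to \mathbb{R}$ be a convex function. For any $x \in \mathbb{R}^d$, for almost all $y \in S^{d-1}$, where $S^{d-1}$ is the unit sphere in $\mathbb{R}^d$,

\[ \lim_{\epsilon \downarrow 0} \nabla f(x + \epsilon y) \]

exists, belongs to $\partial f(x)$ and is unique for any selection $\nabla f(x + \epsilon y) \in \partial f(x + \epsilon y)$ we may make from the subdifferential of $f$ at $x + \epsilon y$ for any $\epsilon > 0$.

Proof. See appendix. □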

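Using the above result we see that for all $t \ge 0$, for almost all values of $B_t$, the limit $\lim_{\epsilon \to 0} \nabla f(X_t + \epsilon B_t)$ exists and belongs to $\partial f(X_t)$. Denote this limit by $\nabla f(X_t)$. Also for any path of $X$ and $B$, for small enough $\epsilon$, i.e. eventually, we have $T_r < \tilde T_{r'}$. So $T \to T_r$ as $\epsilon \to 0$ a.s. and

\[ \lim_{\epsilon \downarrow 0} \nabla f(X_{t \wedge T} + \epsilon B_{t \wedge T}) = \nabla f(X_{t \wedge T_r}) \quad \text{a.s.} \tag{18} \]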

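Again we consider convergence in $\mathcal{H}^2$ first and convergence in $\mathcal{H}^1$ follows. We have, using the fact that $\lim_{\epsilon \to 0} \tilde M_{t \wedge T} = \lim_{\epsilon \to 0} M_{t \wedge T}$,

\begin{align*}
&\lim_{\epsilon \downarrow 0} \Big\| \int_0^t \nabla f(\tilde X_{s \wedge T})\,d\tilde M_{s \wedge T} - \int_0^t \nabla f(X_{s \wedge T_r})\,dM_{s \wedge T_r} \Big\|_{\mathcal{H}^2} \\
&\quad = \lim_{\epsilon \downarrow 0} \mathbb{E}\Big[ \int_0^{T} \nabla f(\tilde X_s)^2\,d\langle \tilde M, \tilde M \rangle_s + \int_0^{T_r} \nabla f(X_s)^2\,d\langle M, M \rangle_s \\
&\qquad\qquad - 2 \int_0^\infty \nabla f(\tilde X_{s \wedge T})\,\nabla f(X_{s \wedge T_r})\,d\langle \tilde M_{\cdot \wedge T}, M_{\cdot \wedge T_r} \rangle_s \Big]^{1/2} \tag{19}
\end{align*}

Once again we can use inequality (13) to see that the first and third integrands in (19) are bounded above by $C_{r'}^2 < \infty$ and $C_r C_{r'} < \infty$ respectively. The integrals themselves are bounded since $\langle M, M \rangle$ is bounded in $B(r)$. Thus we can interchange the limit with the expectation and the integration signs. Convergence (18) and the fact that $T \to T_r$ a.s. then finally yield (17). Noticing that the above is true for all $r' > r > 0$ concludes the proof.

□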

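As was mentioned before, in [3] Bouleau has proved that any measurable choice of subgradient $\nabla f(X_t)$ works for the stochastic integral of Theorem 1. A function

\[ \nabla^{\theta} f(x) = \lim_{\theta \downarrow 0} \mathbb{E}[\nabla f(x + \theta N)] \tag{20} \]

where $N$ is a standard $d$-dimensional Gaussian random variable, is a particular example. $\nabla^{\theta} f(x)$ can be regarded as a sort of average of (sub)gradients within the vicinity of $x$. To verify that it does indeed define a subgradient of $f$ at each $x \in \mathbb{R}^d$ we check the subgradient inequality (4) of Theorem 3. For any $y \in \mathbb{R}^d \setminus \{0\}$ we have

\[ \langle \nabla^{\theta} f(x), y \rangle = \Big\langle \lim_{\theta \downarrow 0} \mathbb{E}[\nabla f(x + \theta N)], y \Big\rangle = \lim_{\theta \downarrow 0} \mathbb{E}[\langle \nabla f(x + \theta N), y \rangle] \tag{21} \]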

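Now $N$ is almost surely finite and also $x + \theta N \in B(|x + \theta N|) \subset B(|x| + |N|)$ for small enough $\theta$ and so eventually for all $\theta$. Hence by the Lipschitz property of $f$ and by the subgradient inequality (4) we have

\begin{align*}
\langle \nabla f(x + \theta N), y \rangle &\le Df(x + \theta N)[y] = \inf_{\lambda > 0} \frac{f(x + \theta N + \lambda y) - f(x + \theta N)}{\lambda} \\
&\le f(x + \theta N + y) - f(x + \theta N) \le K |y|
\end{align*}

for a Lipschitz constant $K < \infty$ depending on $x$ and $N$. Appealing to the bounded convergence theorem now allows us to take the limit inside the expectation in equation (21) above

\[ \langle \nabla^{\theta} f(x), y \rangle = \mathbb{E}\Big[ \Big\langle \lim_{\theta \downarrow 0} \nabla f(x + \theta N), y \Big\rangle \Big] \tag{22} \]

According to Lemma 9, $\lim_{\theta \to 0} \nabla f(x + \theta N)$ exists, is unique and belongs to $\partial f(x)$ for almost all $N$. Denote this limit by $z_N$. Then (22) equals

\[ \mathbb{E}[\langle z_N, y \rangle] \le \mathbb{E}[Df(x)[y]] = Df(x)[y] \]

Hence we have $\langle \nabla^{\theta} f(x), y \rangle \le Df(x)[y]$ for any $y \in \mathbb{R}^d \setminus \{0\}$ for all $x$ and so $\nabla^{\theta} f(x)$ is a well defined subgradient of $f$.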
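
For the one-dimensional example $f(x) = |x|$ the Gaussian average in (20) can be evaluated in closed form, which gives a quick sanity check. The sketch below rests on assumptions made for this illustration only: $d = 1$, $\nabla f = \mathrm{sgn}$, and the hypothetical function name `averaged_subgradient_abs`. Since $x + \theta N > 0$ with probability $\Phi(x/\theta)$, the expectation of the sign equals $2\Phi(x/\theta) - 1 = \mathrm{erf}\big(x / (\theta\sqrt{2})\big)$.

```python
import math


def averaged_subgradient_abs(x, theta):
    """E[sgn(x + theta * N)] for N ~ N(0, 1) and theta > 0, in closed form.

    P(x + theta*N > 0) = Phi(x / theta), so the mean of the sign is
    2*Phi(x/theta) - 1 = erf(x / (theta * sqrt(2))). The value assigned to
    sgn at 0 is irrelevant because x + theta*N = 0 has probability zero.
    """
    return math.erf(x / (theta * math.sqrt(2.0)))


if __name__ == "__main__":
    for x in (-1.0, 0.0, 1.0):
        # Let theta decrease to 0 and watch the average settle.
        print(x, [averaged_subgradient_abs(x, th) for th in (1.0, 0.1, 0.001)])
```

In this example the average at the origin is $0$ for every $\theta$, so the limit is the midpoint of $\partial f(0) = [-1, 1]$ rather than the left-hand derivative $-1$ used in the Meyer-Tanaka formula; away from the origin the limit is the ordinary derivative.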



Appendix: Proof of Lemma 9

Proof. First of all recall that $Df(x)[y] = \lim_{\epsilon \downarrow 0} (f(x + \epsilon y) - f(x)) / \epsilon$ is a positively homogeneous function, convex in $y$ with $Df(x)[0] = 0$. Let $g(y) := Df(x)[y]$. Hence $\nabla g(\lambda y)$ exists and is unique for all $\lambda > 0$ for almost all $y \in \mathbb{R}^d$. Fix $x, y \in \mathbb{R}^d$ and without loss of generality, by adding a suitable affine function to $f$, assume that

\[ f(x) = g(y) = \nabla g(y) = 0 \]

The fact that the limit $\lim_{\epsilon \downarrow 0} \nabla f(x + \epsilon y)$, if it exists, belongs to $\partial f(x)$ follows from [13, Thm. 24.5.1]. To prove existence we argue by contradiction. If the lemma fails then we can find a subsequence $\epsilon_n \to 0$ and a selection $\nabla f(x + \epsilon_n y) \in \partial f(x + \epsilon_n y)$ such that

\[ \lim_{n \to \infty} \nabla f(x + \epsilon_n y) = h \ne 0 \tag{23} \]

and also a vector $u \in \mathbb{R}^d$ with $\langle h, u \rangle > 0$. For such $u$ consider

\[ \frac{f(x + \epsilon_n y + \epsilon_n \lambda u) - f(x + \epsilon_n y)}{\epsilon_n} = \lambda\,\frac{f(x + \epsilon_n y + \epsilon_n \lambda u) - f(x + \epsilon_n y)}{\epsilon_n \lambda} \]

Using (2) and homogeneity of $g(y)$ the above is greater than or equal to

\[ \frac{1}{\epsilon_n} Df(x + \epsilon_n y)[\epsilon_n \lambda u] = \lambda\,Df(x + \epsilon_n y)[u] \ge \lambda \langle \nabla f(x + \epsilon_n y), u \rangle \ge \lambda \langle h, u \rangle + o(1) \]

where the last two inequality signs come from expressions (4) and (23) respectively. Thus we obtain

\[ \frac{f(x + \epsilon_n y + \epsilon_n \lambda u) - f(x + \epsilon_n y)}{\epsilon_n} \ge \lambda \langle h, u \rangle + o(1) \tag{24} \]



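where $o(1) \to 0$ as $n \to \infty$. On the other hand, since $f(x) = g(y) = 0$, we have

\[ \frac{f(x + \epsilon_n y) - f(x)}{\epsilon_n} = \frac{f(x + \epsilon_n y)}{\epsilon_n} = o(1) \tag{25} \]

Hence combining (24) and (25) we obtain

\[ \frac{f(x + \epsilon_n y + \epsilon_n \lambda u) - f(x)}{\epsilon_n} = \frac{f(x + \epsilon_n y + \epsilon_n \lambda u) - f(x + \epsilon_n y)}{\epsilon_n} + \frac{f(x + \epsilon_n y) - f(x)}{\epsilon_n} \ge \lambda \langle h, u \rangle + o(1) \]

Letting $n \to \infty$, i.e. $\epsilon_n \to 0$, the above inequality becomes

\[ Df(x)[y + \lambda u] = g(y + \lambda u) \ge \lambda \langle h, u \rangle > 0 \]

that is,

\[ \frac{g(y + \lambda u)}{\lambda} = \frac{g(y + \lambda u) - g(y)}{\lambda} \ge \langle h, u \rangle > 0 \]

And so letting $\lambda \to 0$ one obtains

\[ \langle \nabla g(y), u \rangle \ge \langle h, u \rangle > 0 \]

But this contradicts the assumption that $\nabla g(y) = 0$.

□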